Effects of Pause Insertion on the Intelligibility of Low Quality Speech

Authors

  • Peter J. Scharpff
  • Vincent J. van Heuven
Abstract

The intelligibility of the output of text-to-speech systems today is generally poorer than that of natural human speech. One way of improving the quality of synthetic speech is to insert speech pauses at selected positions in the utterances rather more frequently than the human reader would choose to do. Pause insertion has been reported to improve intelligibility in deaf speech [1] as well as in speech synthesized from diphones [2]. In the studies cited here, the pauses were inserted at breaks in the syntactic structure of the spoken sentences, without explicit motivation for this particular pausing strategy. Although the choice is intuitively appealing, there may be other pausing strategies that are perceptually as adequate but easier to use in text-to-speech applications. The present research, therefore, aims to establish an optimal pause insertion strategy for low-quality speech synthesis. There are at least two criteria that have to be considered when choosing a pausing strategy. Firstly, the pauses should convey as much useful information to the listener as possible; secondly, the positions where the pauses are to be inserted should be detectable by a simple algorithm that can easily be incorporated in a text-to-speech system. Generally, the more useful the information signalled by the speech pause, the harder it is to find its position automatically. In our experiments we systematically examined the effects of four pausing strategies, which will be discussed in our next sections in ascending order of complexity. The first strategy is one without pauses. The second strategy has pauses at word boundaries at regular intervals of six words. Here the listener gets information about some word beginnings in the sentence. In the third strategy we have marked word boundaries before content words so that information is given to the listener about beginnings of relatively important words in a sentence.
In the last strategy we marked boundaries so as to reveal the prosodic structure of the sentence. The speech pauses in this version are located at the end of intonational or phonological phrases. We expect this last strategy to be the most helpful to improve the intelligibility of low-quality speech.
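The four strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `PAUSE` marker, the function names, and the tiny `FUNCTION_WORDS` list are all hypothetical, and the sketch assumes that every content word after the first word receives a pause, a detail the abstract leaves open. For the prosodic strategy the phrase boundaries are taken as given input, since finding them automatically is the hard part the abstract alludes to.

```python
from typing import Iterable, List, Set

PAUSE = "<pause>"  # hypothetical marker standing in for a silent interval

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration; a real system would use a lexicon or a part-of-speech tagger).
FUNCTION_WORDS: Set[str] = {
    "the", "a", "an", "of", "in", "on", "at", "to",
    "and", "or", "is", "are", "that", "by", "about",
}


def no_pauses(words: List[str]) -> List[str]:
    """Strategy 1: leave the utterance unchanged."""
    return list(words)


def pauses_every_n_words(words: List[str], n: int = 6) -> List[str]:
    """Strategy 2: insert a pause at the word boundary after every n words."""
    out: List[str] = []
    for i, word in enumerate(words):
        if i > 0 and i % n == 0:
            out.append(PAUSE)
        out.append(word)
    return out


def pauses_before_content_words(words: List[str]) -> List[str]:
    """Strategy 3: insert a pause before each word not in FUNCTION_WORDS."""
    out: List[str] = []
    for i, word in enumerate(words):
        if i > 0 and word.lower() not in FUNCTION_WORDS:
            out.append(PAUSE)
        out.append(word)
    return out


def pauses_at_phrase_ends(words: List[str],
                          phrase_end_indices: Iterable[int]) -> List[str]:
    """Strategy 4: insert a pause after each word that ends a prosodic phrase.

    phrase_end_indices holds the positions of phrase-final words; they are
    supplied by the caller (assumed to come from a separate prosodic analysis).
    """
    ends = set(phrase_end_indices)
    out: List[str] = []
    for i, word in enumerate(words):
        out.append(word)
        if i in ends and i < len(words) - 1:  # no pause after the final word
            out.append(PAUSE)
    return out


if __name__ == "__main__":
    sentence = "the old man in the garden is reading a book about birds".split()
    print(" ".join(pauses_every_n_words(sentence)))
    print(" ".join(pauses_before_content_words(sentence)))
    print(" ".join(pauses_at_phrase_ends(sentence, {2, 5})))
```

The sketch mirrors the trade-off stated in the abstract: the first three functions need only word counts or a word list, whereas the fourth depends on phrase boundaries that a text-to-speech system would have to compute separately.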


Similar resources

Speech difficulties in Joubert syndrome

Introduction: "Joubert syndrome" was first introduced in 1969. This syndrome is a rare genetic disease with an autosomal dominant pattern. Hypotonia, ataxia and motor delay are known clinical manifestations of the disease. In the few reports of this syndrome, mostly functional and structural components studied and radiographic images such as speech and language developmental delay symptoms has been l...


Pause Prediction from Text for Speech Synthesis with User-Definable Pause Insertion Likelihood Threshold

Predicting the location of pauses from text is an important aspect for speech synthesizers. The accuracy of pause prediction can significantly influence both naturalness and intelligibility. Pauses which help listeners to better parse the synthesized speech into meaningful units are deemed to increase naturalness and intelligibility ratings, while pauses in unexpected or incorrect locations can...


Subjective and Objective Evaluation of Speech Intelligibility Enhancement Under Constant Energy and Duration Constraints

Speakers appear to adopt strategies to improve speech intelligibility for interlocutors in adverse acoustic conditions. Generated speech, whether synthetic, recorded or live, may also benefit from context-sensitive modifications in challenging situations. The current study measured the effect on intelligibility of six spectral and temporal modifications operating under global constraints of con...


Automated Pause Insertion for Improved Intelligibility Under Reverberation

Speech intelligibility in reverberant environments is reduced because of overlap-masking. Signal modification prior to presentation in such listening environments, e.g., with a public announcement system, can be employed to alleviate this problem. Time-scale modifications are particularly effective in reducing the effect of overlap-masking. A method for introducing linguistically-motivated paus...


Effects of Unnatural Pause on Speech Intelligibility

Effects of unnatural pausing on spoken sentence intelligibility by native and non-native listeners of American English were studied in a framework of 6 different signal-to-noise ratios. Performance between pause and non-pause sentences differed significantly for native listeners at 0 dB and –2 dB signal-to-noise ratios, but no significant difference was obtained from non-native listeners. The te...



Publication date: 2005